A Dataset-Driven Parameter Tuning Approach for Enhanced K-Nearest Neighbour Algorithm Performance

Authors

Abstract

The number of neighbours (k) and the distance measure (DM) are widely modified to improve kNN performance. This work investigates the joint effect of these parameters, in conjunction with dataset characteristics (DC), on kNN performance. Five distance measures (Euclidean, Chebychev, Manhattan, Minkowski, and Filtered), eleven k values, and four dataset characteristics were systematically varied in the parameter-tuning experiments. Each experiment ran for 20 iterations using 10-fold cross-validation on thirty-three randomly selected datasets from the UCI repository. The results show that the average root mean squared error (RMSE) is significantly affected by the type of task (p<0.05, 14.53% variability effect), while the dataset characteristics collectively accounted for 74.54% of the change in RMSE and the distance measure contributed the least (25.4%). The interaction of k and DM yielded DM = 'Minkowski', 3 ≤ k ≤ 20, 7 ≤ target dimension ≤ 9, and sample size (SS) > 9000 as the optimal performance pattern for classification tasks. For regression problems, the experimental configuration should be 7000 ≤ SS ≤ 9000, 4 ≤ number of attributes ≤ 6, and DM = 'Filtered'. The dataset characteristics proved the most influential determinant, followed by the DM. The variation in accuracy resulting from changes in k values occurs only by chance; it does not depict any consistent pattern, and its interaction with the other parameters was statistically insignificant (p>0.5). As further work, the discovered patterns would serve as a standard reference for comparative analytics of kNN-based algorithms.
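The tuning procedure the abstract describes can be illustrated with a minimal sketch: a grid search over candidate k values and distance measures, scored by cross-validation. This is not the paper's exact configuration (the metric names, k range, and synthetic dataset below are illustrative assumptions), but it shows how the joint k/DM effect would be explored in practice.

```python
# Illustrative sketch (not the paper's exact setup): jointly tune the
# number of neighbours k and the distance measure for kNN, scored by
# 10-fold cross-validation as in the described experiments.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a UCI dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

param_grid = {
    "n_neighbors": [3, 5, 7, 11, 15, 20],                      # candidate k values
    "metric": ["euclidean", "manhattan", "chebyshev", "minkowski"],
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10)
search.fit(X, y)

print(search.best_params_)  # best (k, metric) pair for this dataset
```

The best parameter pair is dataset-dependent, which is exactly the interaction between tuning parameters and dataset characteristics that the study quantifies.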


Similar Articles

k-Nearest Neighbour Classifiers

Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier – classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance today because the issue of poor run-time performance is not such...


Enhanced Nearest Neighbour

Multimedia databases usually deal with huge amounts of data, and it is necessary to have an indexing structure such that efficient retrieval of data can be provided. The R-Tree, with its variations, is a commonly cited indexing method. In this paper we propose an improved nearest neighbor search algorithm on the R-tree and its variants. The improvement lies in the removal of two heuristics that have be...


Generalized K-Nearest Neighbour Algorithm- A Predicting Tool

The k-nearest neighbour algorithm is a non-parametric machine learning algorithm generally used for classification. It is also known as instance-based learning or lazy learning. The K-NN algorithm can also be adapted for regression, that is, for estimating continuous variables. In this research paper the researcher endows a generalized K-nearest neighbour algorithm used for predicting a continuous value. In or...
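The adaptation of kNN to regression mentioned in this snippet can be sketched briefly: instead of taking a majority vote over the neighbours' class labels, the prediction is the mean of the k nearest neighbours' target values. The toy data and parameters below are illustrative assumptions, not the generalized algorithm the cited paper proposes.

```python
# Minimal illustration of kNN regression: the predicted value for a
# query point is the average target of its k nearest neighbours.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))  # toy 1-D inputs
y = np.sin(X).ravel()                  # continuous target

reg = KNeighborsRegressor(n_neighbors=5).fit(X, y)
pred = reg.predict([[2.0]])  # mean of the 5 nearest targets
```

With a dense training sample, the averaged prediction tracks the underlying continuous function closely, which is why the same neighbour machinery serves both classification and regression.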


Extending the K-Nearest Neighbour Classification Algorithm to Symbolic Objects

Abstract: Symbolic data analysis generalizes some standard statistical methods to the case of symbolic objects (SO). These objects, informally defined as "aggregated data" since they summarize the information relating to a group of individuals, can be compared in order to identify clusters, to classify them, or to order them according to their degree of generalization. The article pr...


Arabic text classification using k-nearest neighbour algorithm

Many algorithms have been applied to the problem of Automatic Text Categorization (ATC). Most of the work in this area has been carried out on English texts, with only a few researchers addressing Arabic texts. We have investigated the use of the K-Nearest Neighbour (K-NN) classifier, with the Inew, cosine, Jaccard and Dice similarities, in order to enhance Arabic ATC. We represent the datas...



Journal

Journal title: International Journal on Advanced Science, Engineering and Information Technology

Year: 2023

ISSN: 2088-5334, 2460-6952

DOI: https://doi.org/10.18517/ijaseit.13.1.16706